Address reduction blindly identifies non-random data series

Authors

  • Thomas M. A. Fink
  • Francis Brown
  • Karen Willbrand
  • Sebastian E. Ahnert
Abstract

We introduce a method of detecting data series (curves) which exhibit pattern without knowing what kind of pattern they contain. By partitioning the space of curves into neighbourhoods, we show that the curves with the shortest addresses are the most likely to result from simple underlying mechanisms. We show that address reduction is a bound on Kolmogorov complexity and is invariant over noise and one-to-one transformations. We use it to blindly identify gene expression profiles in the yeast cell cycle and the segmentation clock, and to segregate human- and computer-generated random data.

Similar resources

Dimensionality Reduction for Indexing Time Series Based on the Minimum Distance

We address the problem of efficient similarity search based on the minimum distance in large time series databases. To support minimum distance queries, most previous work has to take the preprocessing step of vertical shifting. However, vertical shifting adds overhead when building the index. In this paper, we propose a novel dimensionality reduction technique for indexing time s...

On the Detection of Trends in Time Series of Functional Data

A sequence of functions (curves) collected over time is called a functional time series. Functional time series analysis is one of the popular research areas in which statistics from such data are frequently observed. The main purpose of functional time series analysis is to predict and describe the random mechanisms that generated the data. To do so, it is necessary to decompose functional ti...

Modified Maximum Likelihood Estimation in First-Order Autoregressive Moving Average Models with some Non-Normal Residuals

When modeling time series data using autoregressive-moving average processes, it is common practice to presume that the residuals are normally distributed. However, sometimes we encounter non-normal residuals and asymmetry in the marginal distribution of the data. Despite widespread use of pure autoregressive processes for modeling non-normal time series, the autoregressive-moving average models have le...

Efficient Non-Oblivious Randomized Reduction for Risk Minimization with Improved Excess Risk Guarantee

In this paper, we address learning problems for high dimensional data. Previously, oblivious random projection based approaches that project high dimensional features onto a random subspace have been used in practice for tackling the high-dimensionality challenge in machine learning. Recently, various non-oblivious randomized reduction methods have been developed and deployed for solving many numeri...

1-D random landscapes and non-random data series

We study the simplest random landscape, the curve formed by joining consecutive data points f_1, ..., f_{N+1} with line segments, where the f_i are i.i.d. random numbers and f_i ≠ f_j. We label each segment increasing (+) or decreasing (−) and call this string of +'s and −'s the up-down signature σ. We calculate the probability P(σ(f)) for a random curve and use it to bound the algorithmic inform...
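
The up-down signature described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `up_down_signature` and the sample series are assumptions made for illustration, and ASCII `+` and `-` stand in for the symbols used in the text. Only the labelling step is shown; the probability P(σ(f)) and the complexity bound are not computed here.

```python
def up_down_signature(values):
    """Label each segment between consecutive points as '+' (up) or '-' (down).

    Hypothetical helper for illustration: N+1 data points yield a
    signature of N symbols.
    """
    signature = []
    for left, right in zip(values, values[1:]):
        if left == right:
            # The abstract assumes distinct values, so a tie has no label.
            raise ValueError("consecutive values must differ")
        signature.append("+" if right > left else "-")
    return "".join(signature)


# Illustrative series of five points (made-up numbers), giving four segments.
series = [0.3, 0.7, 0.5, 0.9, 0.1]
print(up_down_signature(series))  # prints "+-+-"
```

Raising an error on equal neighbours mirrors the distinct-values assumption stated in the abstract; how real data with ties should be handled is left open here.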

Publication date: 2006